A Appendix
In Section 3.1, we empirically analyze the differences between SINs of patched and benign images. The benign image set consists of 10,000 images randomly selected from the ImageNet validation set. We analyze the distribution of SINs in terms of both the standard-deviation distance and the cluster number. Compared to Equation (3), an additional penalty loss on the activation values is considered. The insights of ScaleCert are shown in Figure 5. Figure 5(a) illustrates the superficial neurons. The deep-feature importance, by contrast, is obtained by weighting the activations in the last convolutional layer to indicate how important each deep feature is for a specific class. Therefore, the discrimination between deep features of benign and adversarial images is not as intuitive as that of superficial features (as shown in Figure 5(a, b, c)).
- North America > United States > Texas > Kleberg County (0.05)
- North America > United States > Texas > Chambers County (0.05)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
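The appendix text above summarizes SINs of patched and benign images via two statistics, the standard-deviation distance and the cluster number. As a rough illustration only, the sketch below computes analogous quantities for a single image, assuming SINs are already given as a boolean mask over a shallow-layer activation map; the function name `sin_statistics` and the SIN selection rule in the usage example are hypothetical, and the exact ScaleCert definitions may differ.

```python
import numpy as np
from scipy import ndimage

def sin_statistics(activations: np.ndarray, sin_mask: np.ndarray):
    """Summarize superficial important neurons (SINs) of one image.

    activations: 2-D array of shallow-layer activation magnitudes (H x W).
    sin_mask:    boolean array of the same shape marking SIN locations.
    Returns (std_distance, cluster_count), loosely mirroring the two
    statistics discussed above; the paper's exact definitions may differ.
    """
    mu, sigma = activations.mean(), activations.std() + 1e-8
    # How far the SIN activations sit from the overall mean, in units of std dev.
    std_distance = float(np.mean((activations[sin_mask] - mu) / sigma))
    # Count connected components of SIN locations (8-connectivity).
    _, cluster_count = ndimage.label(sin_mask, structure=np.ones((3, 3)))
    return std_distance, int(cluster_count)

# Toy usage: patched images are expected to yield larger, more concentrated SINs.
acts = np.random.rand(14, 14)
mask = acts > np.quantile(acts, 0.9)   # hypothetical SIN selection rule
print(sin_statistics(acts, mask))
```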
On the Feasibility of Poisoning Text-to-Image AI Models via Adversarial Mislabeling
Wu, Stanley, Bhaskar, Ronik, Ha, Anna Yoo Jeong, Shan, Shawn, Zheng, Haitao, Zhao, Ben Y.
Today's text-to-image generative models are trained on millions of images sourced from the Internet, each paired with a detailed caption produced by Vision-Language Models (VLMs). This part of the training pipeline is critical for supplying the models with large volumes of high-quality image-caption pairs during training. However, recent work suggests that VLMs are vulnerable to stealthy adversarial attacks, where adversarial perturbations are added to images to mislead the VLMs into producing incorrect captions. In this paper, we explore the feasibility of adversarial mislabeling attacks on VLMs as a mechanism for poisoning the training pipelines of text-to-image models. Our experiments demonstrate that VLMs are highly vulnerable to adversarial perturbations, allowing attackers to produce benign-looking images that are consistently miscaptioned by the VLMs. This has the effect of injecting strong "dirty-label" poison samples into the training pipeline for text-to-image models, successfully altering their behavior with a small number of poisoned samples. We find that while potential defenses can be effective, they can be targeted and circumvented by adaptive attackers. This suggests a cat-and-mouse game that is likely to reduce the quality of training data and increase the cost of text-to-image model development. Finally, we demonstrate the real-world effectiveness of these attacks, achieving high attack success (over 73%) even in black-box scenarios against commercial VLMs (Google Vertex AI and Microsoft Azure).
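The attack described above hinges on making a benign-looking image yield an incorrect caption. As a hedged illustration of the general idea, not the paper's actual method, the sketch below runs a PGD-style optimization against a hypothetical differentiable surrogate `image_encoder`, pushing the image embedding toward the embedding of an attacker-chosen wrong caption while staying inside an L-infinity budget; `image_encoder` and `target_text_emb` are assumptions of this sketch, and the black-box transfer to commercial VLMs is more involved.

```python
import torch
import torch.nn.functional as F

def adversarial_mislabel(image, target_text_emb, image_encoder,
                         eps=8 / 255, alpha=1 / 255, steps=200):
    """PGD-style sketch of an adversarial mislabeling attack.

    image:           tensor in [0, 1], e.g. shape (3, H, W).
    target_text_emb: embedding of the attacker-chosen incorrect caption.
    image_encoder:   assumed differentiable surrogate of the VLM's image tower.
    Keeps the perturbation within an L-infinity ball of radius eps so the
    image still looks benign while its embedding matches the wrong caption.
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        emb = image_encoder(torch.clamp(image + delta, 0, 1))
        # Minimize (1 - cosine similarity) to the wrong-caption embedding.
        loss = 1 - F.cosine_similarity(emb, target_text_emb, dim=-1).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # signed gradient descent step
            delta.clamp_(-eps, eps)              # project back into the L-inf ball
        delta.grad.zero_()
    return torch.clamp(image + delta, 0, 1).detach()
```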
Object-oriented backdoor attack against image captioning
Li, Meiling, Zhong, Nan, Zhang, Xinpeng, Qian, Zhenxing, Li, Sheng
Backdoor attacks against image classification have been widely studied and proven successful, while there is little research on backdoor attacks against vision-language models. In this paper, we explore backdoor attacks on image captioning models by poisoning training data. We assume the attacker has full access to the training dataset but cannot intervene in model construction or the training process. Specifically, a portion of benign training samples is randomly selected to be poisoned. Afterwards, considering that captions usually unfold around the objects in an image, we design an object-oriented method to craft poisons, which modifies pixel values within a small range, with the number of modified pixels proportional to the scale of the detected object region. After training with the poisoned data, the attacked model behaves normally on benign images, but for poisoned images, the model generates sentences irrelevant to the given image. The attack controls the model's behavior on specific test images without sacrificing generation performance on benign test images. Our method demonstrates the vulnerability of image captioning models to backdoor attacks, and we hope this work raises awareness of the need to defend against backdoor attacks in the image captioning field.
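To make the poisoning step above concrete, here is a hedged sketch of how a fraction of (image, caption) pairs might be selected and perturbed, with the number of modified pixels scaling with detected object size and the caption replaced by an irrelevant sentence. The helper `detect_objects`, the scaling factor, and the target caption are illustrative assumptions, not the paper's exact recipe.

```python
import random
import numpy as np

def poison_caption_dataset(dataset, detect_objects, poison_rate=0.05,
                           perturb_range=4,
                           target_caption="a photo of nothing in particular"):
    """Hedged sketch of object-oriented caption-poisoning.

    dataset:        list of (image, caption) pairs, images as uint8 arrays.
    detect_objects: assumed helper returning (x0, y0, x1, y1) boxes for an image.
    For each poisoned sample, a number of pixels proportional to each object's
    area is shifted by at most `perturb_range` intensity levels, and the
    caption is replaced with an irrelevant sentence.
    """
    poisoned = []
    for image, caption in dataset:
        if random.random() >= poison_rate:
            poisoned.append((image, caption))
            continue
        img = image.astype(np.int16).copy()
        for (x0, y0, x1, y1) in detect_objects(image):
            area = (x1 - x0) * (y1 - y0)
            n_pixels = area // 10      # modification count scales with region size
            if n_pixels <= 0:
                continue
            ys = np.random.randint(y0, y1, size=n_pixels)
            xs = np.random.randint(x0, x1, size=n_pixels)
            img[ys, xs] += np.random.randint(
                -perturb_range, perturb_range + 1,
                size=(n_pixels,) + img.shape[2:])
        poisoned.append((np.clip(img, 0, 255).astype(np.uint8), target_caption))
    return poisoned
```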
DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models
Zhou, Jiachen, Lv, Peizhuo, Lan, Yibing, Meng, Guozhu, Chen, Kai, Ma, Hualong
Dataset sanitization is a widely adopted proactive defense against poisoning-based backdoor attacks, aimed at filtering out and removing poisoned samples from training datasets. However, existing methods have shown limited efficacy in countering ever-evolving trigger functions and often lead to considerable degradation of benign accuracy. In this paper, we propose DataElixir, a novel sanitization approach tailored to purifying poisoned datasets. We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning poisoned samples into benign ones. Specifically, over multiple iterations of the forward and reverse diffusion processes, we extract intermediary images and their predicted labels for each sample in the original dataset. Then, we identify anomalous samples based on the presence of label transitions among the intermediary images, detect the target label by quantifying distribution discrepancy, select purified images considering pixel and feature distance, and determine ground-truth labels by training a benign model. Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy, surpassing the performance of baseline defense methods.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > China > Beijing > Beijing (0.04)
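The per-sample purification loop described in the DataElixir abstract can be sketched roughly as below, assuming a helper `diffuse_denoise` that applies one round of the diffusion forward (noising) and reverse (denoising) processes, and a classifier trained on the possibly poisoned data. Target-label detection and the final benign relabeling model are omitted, so this is only a partial, hedged illustration.

```python
import torch

@torch.no_grad()
def purify_sample(x, y, diffuse_denoise, classifier, iters=5):
    """Hedged sketch of one DataElixir-style purification pass.

    x:  image tensor; y: its (possibly poisoned) integer label.
    diffuse_denoise: assumed helper running forward noising + reverse denoising.
    classifier:      model trained on the original (possibly poisoned) dataset.
    Records intermediary images and their predicted labels, flags the sample
    as suspicious if the prediction ever drifts from y (a label transition),
    and returns the intermediary image closest to x in pixel distance.
    """
    intermediaries, preds = [], []
    cur = x
    for _ in range(iters):
        cur = diffuse_denoise(cur)                      # one forward/reverse round
        intermediaries.append(cur)
        preds.append(classifier(cur.unsqueeze(0)).argmax(dim=-1).item())
    suspicious = any(p != y for p in preds)             # label transition observed?
    dists = [torch.norm(z - x).item() for z in intermediaries]
    purified = intermediaries[dists.index(min(dists))]  # closest intermediary image
    return purified, suspicious
```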
Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback
Yoon, TaeHo, Myoung, Kibeom, Lee, Keon, Cho, Jaewoong, No, Albert, Ryu, Ernest K.
Diffusion models have recently shown remarkable success in high-quality image generation. Sometimes, however, a pre-trained diffusion model exhibits partial misalignment in the sense that the model can generate good images, but it sometimes outputs undesirable images. If so, we simply need to prevent the generation of the bad images, and we call this task censoring. In this work, we present censored generation with a pre-trained diffusion model using a reward model trained on minimal human feedback. We show that censoring can be accomplished with extreme human feedback efficiency and that labels generated with a mere few minutes of human feedback are sufficient.
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
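As a hedged sketch of how reward feedback might steer sampling in the censoring setup described above, the following DDPM step shifts the posterior mean along the gradient of a log-reward, in the style of classifier guidance. The paper's actual guidance scheme may differ; `eps_model`, `reward_model` (outputting the probability that an image is acceptable, trained on the small set of human labels), and the noise-schedule tensors `alphas` and `alphas_bar` are assumptions of this sketch.

```python
import torch

def censored_ddpm_step(x_t, t, eps_model, reward_model, alphas, alphas_bar,
                       guidance=1.0):
    """Hedged sketch of one reward-guided (censored) DDPM sampling step.

    x_t: current noisy sample; t: integer timestep.
    alphas, alphas_bar: 1-D tensors of the DDPM noise schedule.
    The standard posterior mean is shifted along the gradient of the
    log-reward, analogous to classifier guidance.
    """
    a_t, ab_t = alphas[t], alphas_bar[t]
    eps = eps_model(x_t, t)
    # Standard DDPM posterior mean: (x_t - (1-a_t)/sqrt(1-ab_t) * eps) / sqrt(a_t).
    mean = (x_t - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)

    # Gradient of the log-reward w.r.t. the current sample.
    x_in = x_t.detach().requires_grad_(True)
    log_r = torch.log(reward_model(x_in).clamp_min(1e-8)).sum()
    grad = torch.autograd.grad(log_r, x_in)[0]

    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    sigma = torch.sqrt(1 - a_t)                      # sqrt(beta_t) variance choice
    return mean + guidance * sigma ** 2 * grad + sigma * noise
```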